ABSTRACT



It is well known that the standard estimator for variance components in analysis of variance, Model II, $\hat\sigma_a^2$, can be negative with positive probability. In practice, when such an estimator is found to be negative it is taken to be zero. Very little is known about the properties of the corresponding truncated estimator. This thesis investigates the variance and bias of the positive truncated estimator $\tilde\sigma_a^2$. A method of selecting $\ell$, the number of classes, is presented that produces maximum power for a test of the hypothesis that $\sigma_a^2 = 0$ while keeping the variance and bias as small as possible.

I. INTRODUCTION



The use of Model II, or the Random Effects Model, in analysis of variance can best be described by a simple example. Suppose we draw a sample of $\ell$ pieces of steel from the population of pieces which have been subjected to a particular annealing process. These $\ell$ pieces may be considered as a random sample from the population composed of all such pieces of steel which have been or will be produced by this specific process. We might wish to determine the variation of flexural rigidity after the annealing process between the various members of the whole population. If exact measurements of flexural rigidity could be taken from the pieces on hand, the variance could be derived from straightforward statistical methods. However, the experimental methods used to measure flexural rigidity are subject to error. This error is reflected in the fact that if several measurements are taken from one piece of steel, the results are not always exactly the same. In fact, it may be the case that the measurement (experimental) errors are of the same or greater magnitude than the variation we wish to measure between the true rigidities of the different pieces. Using analysis of variance, Model II, it is possible to separate and isolate these two different causes of variation and to obtain an estimate of the true variation of rigidity.

The data for such an analysis will consist of several different measurements of flexural rigidity taken from each of the $\ell$ pieces of steel. If we take $r$ measurements on each of the $\ell$ pieces, then the total number of measurements will be $\ell r = N$.

The Model II analysis of variance now takes the form

$$Y_{ij} = \mu + a_i + e_{ij}, \quad i = 1, 2, \ldots, \ell, \quad j = 1, 2, \ldots, r. \tag{1.1}$$

The following assumptions and definitions are standard for this model:

$Y_{ij}$ represents the $j$th measurement of the $i$th piece of steel.

$\mu$ is the "true" mean flexural rigidity of the population and is assumed constant.

$a_i$ is the deviation from the mean of the $i$th piece. The $a_i$ are assumed to be distributed $N(0, \sigma_a^2)$.

$e_{ij}$ is the measurement error of the $j$th measurement on the $i$th piece. The $e_{ij}$ are assumed to be distributed $N(0, \sigma_e^2)$.

$a_i$ and $e_{ij}$ are assumed independent.

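For the balanced one-way classification Model II analysis of variance just described, it is well known that the minimum variance unbiased estimator for the true variation between the pieces is

$$\hat\sigma_a^2 = \frac{MS_a - MS_e}{r}, \tag{1.2}$$

where

$$MS_a = \sum_{i=1}^{\ell} \frac{r(\bar y_{i\cdot} - \bar y_{\cdot\cdot})^2}{\ell - 1} \quad \text{and} \quad MS_e = \sum_{i=1}^{\ell} \sum_{j=1}^{r} \frac{(y_{ij} - \bar y_{i\cdot})^2}{\ell(r-1)}.$$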

Leone and Nelson [Ref. 3] found from an empirical study that this estimator can be negative with probability as high as .40. In practical applications the estimator is taken as zero whenever it is negative. This produces a truncation of the true distribution of the minimum variance unbiased estimator. The truncated estimator takes the form

$$\tilde\sigma_a^2 = \begin{cases} \dfrac{MS_a - MS_e}{r} & \text{if } MS_a \ge MS_e \\ 0 & \text{otherwise.} \end{cases} \tag{1.3}$$

The properties of this truncated estimator are unknown at present.
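The two estimators in Eqs. (1.2) and (1.3) can be illustrated with a small simulation. The following is a minimal sketch, not the thesis program: the function names, the seed, and the layout ($\ell = 4$, $r = 3$, $\sigma_a^2 = 0.1$, $\sigma_e^2 = 1$) are assumptions chosen only for illustration.

```python
import math
import random


def variance_component_estimates(data):
    """Return (untruncated, truncated) estimates of sigma_a^2.

    `data` is a list of ell classes, each a list of r measurements
    (balanced one-way layout), following Eqs. (1.2) and (1.3).
    """
    ell = len(data)
    r = len(data[0])
    class_means = [sum(row) / r for row in data]
    grand_mean = sum(class_means) / ell
    ms_a = sum(r * (m - grand_mean) ** 2 for m in class_means) / (ell - 1)
    ms_e = sum((y - m) ** 2
               for row, m in zip(data, class_means)
               for y in row) / (ell * (r - 1))
    untruncated = (ms_a - ms_e) / r
    truncated = max(untruncated, 0.0)  # negative estimates are set to zero
    return untruncated, truncated


def negative_fraction(ell, r, sigma_a2, sigma_e2, trials, seed):
    """Fraction of simulated Model II samples with a negative Eq. (1.2)."""
    rng = random.Random(seed)
    negatives = 0
    for _trial in range(trials):
        data = []
        for _cls in range(ell):
            a_i = rng.gauss(0.0, math.sqrt(sigma_a2))  # class effect
            data.append([a_i + rng.gauss(0.0, math.sqrt(sigma_e2))
                         for _obs in range(r)])
        hat, _tilde = variance_component_estimates(data)
        if hat < 0:
            negatives += 1
    return negatives / trials


if __name__ == "__main__":
    # Hypothetical setting: small K, where negative estimates are common.
    print(negative_fraction(4, 3, 0.1, 1.0, trials=2000, seed=1))
```

The mean $\mu$ is taken as zero in the simulation because both mean squares are unaffected by a common shift of all observations.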

Consider a situation where $N$, the total number of experiments, is fixed. Within this framework, this paper is concerned with an empirical investigation of the properties of the estimator $\tilde\sigma_a^2$.

The following questions are considered:

1. What can be said concerning the effects of various choices of $\ell$ and $r$ on the bias and variance of $\tilde\sigma_a^2$?

2. How does the variance of $\tilde\sigma_a^2$ compare with the variance of $\hat\sigma_a^2$ for a given $N$ and $\ell$?

3. Can an allocation method for $\ell$ be found to yield minimum variance or minimum bias for $\tilde\sigma_a^2$?

4. If such an allocation method can be developed, how does it compare with the allocation formula for $r$ developed by Hammersley [Ref. 1] to minimize the variance of $\hat\sigma_a^2$ when $K = \sigma_a^2/\sigma_e^2$ is known?

5. If nothing is known about $K$, is there a "best" allocation method for $\ell$?

6. If we are testing the null hypothesis $H_0: \sigma_a^2 = 0$ against $H_1: \sigma_a^2 \ne 0$, how does the allocation of $\ell$ affect the power of this test?

II. METHODOLOGY



A. THEORY 



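In order to investigate the behavior of the estimator it is first necessary to develop the distribution of $\tilde\sigma_a^2$ and expressions for the bias and variance of this estimator.

1. Distribution of $\tilde\sigma_a^2$

Let $U$ and $V$ be two independent random variables with Pearson Type III distribution so that

$$f(u) = \begin{cases} \dfrac{\gamma_1^{\tau_1+1}}{\Gamma(\tau_1+1)}\, u^{\tau_1} e^{-\gamma_1 u} & \text{if } u > 0 \\ 0 & \text{otherwise} \end{cases}$$

and

$$f(v) = \begin{cases} \dfrac{\gamma_2^{\tau_2+1}}{\Gamma(\tau_2+1)}\, v^{\tau_2} e^{-\gamma_2 v} & \text{if } v > 0 \\ 0 & \text{otherwise.} \end{cases}$$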



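Pearson [Ref. 4] found the distribution of $Y = U - V$ to be

$$g(y) = \frac{\gamma_1^{\tau_1+1} \gamma_2^{\tau_2+1}}{\Gamma(\tau_1+1)(\gamma_1+\gamma_2)^{\tau_2+1}}\, e^{-\gamma_1 y}\, y^{\tau_1} \left[ 1 + \frac{\tau_1(\tau_2+1)}{1!\,(\gamma_1+\gamma_2)\,y} + \frac{\tau_1(\tau_1-1)(\tau_2+1)(\tau_2+2)}{2!\,(\gamma_1+\gamma_2)^2\, y^2} + \cdots \right] \quad \text{if } y > 0 \tag{2.1a}$$

$$g(y) = \frac{\gamma_1^{\tau_1+1} \gamma_2^{\tau_2+1}}{\Gamma(\tau_2+1)(\gamma_1+\gamma_2)^{\tau_1+1}}\, e^{\gamma_2 y}\, (-y)^{\tau_2} \left[ 1 + \frac{\tau_2(\tau_1+1)}{1!\,(\gamma_1+\gamma_2)(-y)} + \frac{\tau_2(\tau_2-1)(\tau_1+1)(\tau_1+2)}{2!\,(\gamma_1+\gamma_2)^2 (-y)^2} + \cdots \right] \quad \text{if } y < 0. \tag{2.1b}$$

He also showed that

$$\int_0^\infty g(y)\,dy = I_{\frac{\gamma_2}{\gamma_1+\gamma_2}}(\tau_2+1,\ \tau_1+1), \tag{2.2}$$

$$\int_0^\infty y\,g(y)\,dy = \frac{\tau_1+1}{\gamma_1}\, I_{\frac{\gamma_2}{\gamma_1+\gamma_2}}(\tau_2+1,\ \tau_1+1) - \frac{\tau_2+1}{\gamma_2}\, I_{\frac{\gamma_2}{\gamma_1+\gamma_2}}(\tau_2+2,\ \tau_1), \tag{2.3}$$

and

$$\int_0^\infty e^{ty} g(y)\,dy = \left(1 - \frac{t}{\gamma_1}\right)^{-(\tau_1+1)} \left(1 + \frac{t}{\gamma_2}\right)^{-(\tau_2+1)} I_{\frac{\gamma_2+t}{\gamma_1+\gamma_2}}(\tau_2+1,\ \tau_1+1), \tag{2.4}$$

where $I_x(p,q)$ is the ratio of the incomplete beta function to the complete beta function.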

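If we choose

$$\gamma_1 = \frac{1}{2a}, \quad \gamma_2 = \frac{1}{2b}, \quad \tau_1 = \frac{n}{2} - 1, \quad \text{and} \quad \tau_2 = \frac{m}{2} - 1, \tag{2.5}$$

and substitute these values into (2.1), we obtain the density function of $Y = aX_1 - bX_2$, where $X_1$ and $X_2$ are independent chi-square variables with $n$ and $m$ degrees of freedom respectively and $a$ and $b$ are positive constants. This density function becomes

$$g(y) = \begin{cases} \dfrac{1}{\Gamma(\frac{n}{2})\, 2^{\frac{n}{2}}\, a^{\frac{n-m}{2}} (a+b)^{\frac{m}{2}}}\; e^{-y/2a}\, y^{\frac{n}{2}-1}\; {}_2F_0\!\left[\dfrac{m}{2},\ \left(1 - \dfrac{n}{2}\right);\ \dfrac{-2ab}{(a+b)\,y}\right] & \text{if } y > 0 \\[2ex] \dfrac{1}{\Gamma(\frac{m}{2})\, 2^{\frac{m}{2}}\, b^{\frac{m-n}{2}} (a+b)^{\frac{n}{2}}}\; e^{y/2b}\, (-y)^{\frac{m}{2}-1}\; {}_2F_0\!\left[\dfrac{n}{2},\ \left(1 - \dfrac{m}{2}\right);\ \dfrac{2ab}{(a+b)\,y}\right] & \text{if } y < 0 \end{cases} \tag{2.6}$$

where

$${}_2F_0[p, q; X] = \sum_{k=0}^{\infty} \frac{(p)_k\,(q)_k}{k!}\, X^k$$

with

$$(a)_k = a(a+1)\cdots(a+k-1) = \frac{\Gamma(a+k)}{\Gamma(a)}.$$

Now let

$$Y^+ = \begin{cases} Y & \text{if } Y > 0 \\ 0 & \text{otherwise.} \end{cases}$$

The distribution for $Y^+$ will be

$$H_{Y^+}(y) = \begin{cases} 0, & y < 0 \\ \int_{-\infty}^{0} g(u)\,du, & y = 0 \\ \int_{-\infty}^{y} g(u)\,du, & y > 0, \end{cases} \tag{2.7}$$

and using Eq. (2.6) we get the density function

$$h_{Y^+}(y) = \begin{cases} \int_{-\infty}^{0} g(u)\,du, & y = 0 \\ g(y), & y > 0 \\ 0, & \text{otherwise.} \end{cases} \tag{2.8}$$

From equations (2.3), (2.4), (2.5), and (2.8) it follows that

$$E(Y^+) = na\, I_{\frac{a}{a+b}}\!\left(\frac{m}{2}, \frac{n}{2}\right) - mb\, I_{\frac{a}{a+b}}\!\left(\frac{m}{2} + 1,\ \frac{n}{2} - 1\right), \tag{2.9}$$

and

$$E(e^{tY^+}) = I_{\frac{b}{a+b}}\!\left(\frac{n}{2}, \frac{m}{2}\right) + (1 - 2at)^{-\frac{n}{2}} (1 + 2bt)^{-\frac{m}{2}}\, I_{\frac{a(1+2bt)}{a+b}}\!\left(\frac{m}{2}, \frac{n}{2}\right). \tag{2.10}$$

The distribution of $\tilde\sigma_a^2$ is the same as that of $Y^+$ with

$$n = \ell - 1, \quad m = \ell(r-1), \quad a = \frac{\sigma_e^2 + r\sigma_a^2}{r(\ell-1)}, \quad b = \frac{\sigma_e^2}{r\ell(r-1)}. \tag{2.11}$$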
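The constants $n$, $m$, $a$, and $b$ can be traced back to the mean squares. The following short derivation is a sketch that relies on the fact, stated in the Power subsection, that $SS_a/(\sigma_e^2 + r\sigma_a^2)$ and $SS_e/\sigma_e^2$ are independent chi-square variables; the names $X_1$ and $X_2$ are used here for those two variables.

```latex
\begin{align*}
MS_a &= \frac{SS_a}{\ell-1} = \frac{\sigma_e^2 + r\sigma_a^2}{\ell-1}\,X_1,
  & X_1 &\sim \chi^2_{\ell-1},\\
MS_e &= \frac{SS_e}{\ell(r-1)} = \frac{\sigma_e^2}{\ell(r-1)}\,X_2,
  & X_2 &\sim \chi^2_{\ell(r-1)},\\
\hat\sigma_a^2 &= \frac{MS_a - MS_e}{r} = aX_1 - bX_2,
  & a &= \frac{\sigma_e^2 + r\sigma_a^2}{r(\ell-1)},\quad
    b = \frac{\sigma_e^2}{r\ell(r-1)},\\
E(\hat\sigma_a^2) &= na - mb
  = \frac{\sigma_e^2 + r\sigma_a^2}{r} - \frac{\sigma_e^2}{r} = \sigma_a^2,
  & n &= \ell-1,\quad m = \ell(r-1).
\end{align*}
```

Truncating $aX_1 - bX_2$ at zero then gives $\tilde\sigma_a^2$, which is why its distribution coincides with that of $Y^+$.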



2. Variance and Bias of $\tilde\sigma_a^2$

As indicated above, the distribution of $\tilde\sigma_a^2$ is the same as the distribution of $Y^+$ for the proper choice of $n$, $m$, $a$, and $b$. Thus, Eqs. (2.9) and (2.10) give the expected value of $\tilde\sigma_a^2$ and its moment generating function when $\tilde\sigma_a^2$ is substituted for $Y^+$ and $m$, $n$, $a$, and $b$ are defined as in Eq. (2.11).

The variance of $\tilde\sigma_a^2$ can now be derived by recalling that

$$\operatorname{Var}(\tilde\sigma_a^2) = \left.\frac{d^2M}{dt^2}\right|_{t=0} - \left(\left.\frac{dM}{dt}\right|_{t=0}\right)^2, \tag{2.12}$$

where $M = E(e^{t\tilde\sigma_a^2})$.



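Applying successive derivatives to Eq. (2.10) and evaluating at $t = 0$, the expected value of $\tilde\sigma_a^2$ is found to be

$$E(\tilde\sigma_a^2) = \left.\frac{dM}{dt}\right|_{t=0} = (na - mb)\, I_{\frac{a}{a+b}}\!\left(\frac{m}{2}, \frac{n}{2}\right) + \frac{2ab}{a+b} \left(\frac{a}{a+b}\right)^{\frac{m}{2}-1} \left(\frac{b}{a+b}\right)^{\frac{n}{2}-1} \Big/ \beta\!\left(\frac{m}{2}, \frac{n}{2}\right), \tag{2.13}$$

where $\beta(\frac{m}{2}, \frac{n}{2})$ is the beta function with parameters $m/2$ and $n/2$, and

$$\left.\frac{d^2M}{dt^2}\right|_{t=0} = (a^2n^2 + 2a^2n - 2abmn + 2b^2m + b^2m^2)\, I_{\frac{a}{a+b}}\!\left(\frac{m}{2}, \frac{n}{2}\right) + \frac{2ab}{a+b}\,(na - mb + 2a - 2b) \left(\frac{a}{a+b}\right)^{\frac{m}{2}-1} \left(\frac{b}{a+b}\right)^{\frac{n}{2}-1} \Big/ \beta\!\left(\frac{m}{2}, \frac{n}{2}\right). \tag{2.14}$$

Equation (2.13) can be shown to be equivalent to Eq. (2.9).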
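The equivalence can be checked with one standard recurrence for the regularized incomplete beta function. The following is a sketch of that step. The shorthand $p = m/2$, $q = n/2$, $x = a/(a+b)$ is introduced only for this derivation, and Eq. (2.9) is taken in the form $E(Y^+) = na\,I_x(p,q) - mb\,I_x(p+1,q-1)$, which is the form evaluated by the EYPLUS routine of Appendix A.

```latex
\begin{align*}
I_x(p+1,\,q-1) &= I_x(p,q) - \frac{x^{p}(1-x)^{q-1}}{p\,\beta(p,q)},\\[1ex]
na\,I_x(p,q) - mb\,I_x(p+1,\,q-1)
  &= (na - mb)\,I_x(p,q) + \frac{mb}{p}\,\frac{x^{p}(1-x)^{q-1}}{\beta(p,q)}\\
  &= (na - mb)\,I_x(p,q) + \frac{2ab}{a+b}\,
     \frac{x^{p-1}(1-x)^{q-1}}{\beta(p,q)},
\end{align*}
```

The last line uses $m/p = 2$ and $bx = ab/(a+b)$, and it is the right-hand side of Eq. (2.13) once $x$ and $1 - x = b/(a+b)$ are written out.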



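Squaring Eq. (2.13) and subtracting from Eq. (2.14), the variance of $\tilde\sigma_a^2$ is obtained as

$$\begin{aligned} \operatorname{Var}(\tilde\sigma_a^2) ={}& 2(na^2 + mb^2)\, I_{\frac{a}{a+b}}\!\left(\frac{m}{2}, \frac{n}{2}\right) \\ &+ (na - mb) \left[ na\, I_{\frac{a}{a+b}}\!\left(\frac{m}{2}, \frac{n}{2}\right) - mb\, I_{\frac{a}{a+b}}\!\left(\frac{m}{2}+1,\ \frac{n}{2}-1\right) \right] \\ &+ 4 \left(\frac{a}{a+b}\right)^{\frac{m}{2}} \left(\frac{b}{a+b}\right)^{\frac{n}{2}-1} b(a-b) \Big/ \beta\!\left(\frac{m}{2}, \frac{n}{2}\right) \\ &- \left[ na\, I_{\frac{a}{a+b}}\!\left(\frac{m}{2}, \frac{n}{2}\right) - mb\, I_{\frac{a}{a+b}}\!\left(\frac{m}{2}+1,\ \frac{n}{2}-1\right) \right]^2. \end{aligned} \tag{2.15}$$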







From Eq. (2.9), the bias of the estimator is given by

$$\text{bias} = na\, I_{\frac{a}{a+b}}\!\left(\frac{m}{2}, \frac{n}{2}\right) - mb\, I_{\frac{a}{a+b}}\!\left(\frac{m}{2}+1,\ \frac{n}{2}-1\right) - \sigma_a^2. \tag{2.16}$$
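The bias expression can be evaluated numerically once the regularized incomplete beta function $I_x(p,q)$ is available. The following is a minimal sketch using only the Python standard library; it is not the thesis program. The continued-fraction routine is a common textbook method swapped in for the IBM BDTR subroutine, and all function names are hypothetical. It assumes $\ell \ge 4$ so that $n/2 - 1 > 0$.

```python
import math


def _beta_cf(x, a, b, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered term of the continued fraction.
        aa = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        h *= d * c
        # Odd-numbered term of the continued fraction.
        aa = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(x, p, q):
    """Regularized incomplete beta function I_x(p, q) for p, q > 0."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(p + q) - math.lgamma(p) - math.lgamma(q)
                 + p * math.log(x) + q * math.log1p(-x))
    front = math.exp(log_front)
    if x < (p + 1.0) / (p + q + 2.0):
        return front * _beta_cf(x, p, q) / p
    # Use the symmetry I_x(p, q) = 1 - I_{1-x}(q, p) for faster convergence.
    return 1.0 - front * _beta_cf(1.0 - x, q, p) / q


def truncated_mean(ell, r, sigma_a2, sigma_e2=1.0):
    """E(Y+) from Eq. (2.9) with n, m, a, b taken from Eq. (2.11)."""
    n = ell - 1
    m = ell * (r - 1)
    if n <= 2:
        raise ValueError("this form of Eq. (2.9) needs ell >= 4")
    a = (sigma_e2 + r * sigma_a2) / (r * (ell - 1))
    b = sigma_e2 / (r * ell * (r - 1))
    x = a / (a + b)
    return (n * a * reg_inc_beta(x, m / 2, n / 2)
            - m * b * reg_inc_beta(x, m / 2 + 1, n / 2 - 1))


def truncation_bias(ell, r, sigma_a2, sigma_e2=1.0):
    """Bias of the truncated estimator, Eq. (2.16)."""
    return truncated_mean(ell, r, sigma_a2, sigma_e2) - sigma_a2


if __name__ == "__main__":
    # N = 12 split as ell = 4 classes of r = 3, with K = 0.1.
    print(round(truncation_bias(4, 3, 0.1), 4))
```

With $\sigma_e^2 = 1$, calling `truncation_bias(4, 3, 0.1)` corresponds to $N = 12$, $K = .1$ and can be compared with the first entry of Table I.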



The expression for the variance of $\hat\sigma_a^2$ is

$$\operatorname{Var}(\hat\sigma_a^2) = \frac{2}{r^2} \left[ \frac{(\sigma_e^2 + r\sigma_a^2)^2}{\ell - 1} + \frac{\sigma_e^4}{\ell(r-1)} \right]. \tag{2.17}$$

Thus, values for the variance of the minimum variance unbiased estimator, $\hat\sigma_a^2$, can be computed for a comparison with the variance of $\tilde\sigma_a^2$ for fixed $N$ and $\ell$.
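Equation (2.17) is simple enough to evaluate directly. The sketch below mirrors the VARR computation in Appendix A; the function name and argument names are hypothetical.

```python
def var_untruncated(ell, r, sigma_a2, sigma_e2=1.0):
    """Variance of the unbiased estimator, Eq. (2.17).

    ell is the number of classes, r the observations per class.
    """
    between = (sigma_e2 + r * sigma_a2) ** 2 / (ell - 1)
    within = sigma_e2 ** 2 / (ell * (r - 1))
    return (2.0 / r ** 2) * (between + within)


if __name__ == "__main__":
    # (N, ell, K) combinations that also appear in Table II (sigma_e^2 = 1).
    for n_total, ell, k in [(12, 4, 0.1), (20, 5, 1.0), (100, 20, 5.0)]:
        r = n_total // ell
        print(n_total, ell, k, round(var_untruncated(ell, r, k), 4))
```

For $N = 12$, $\ell = 4$, $K = .1$ this gives $(2/9)[1.3^2/3 + 1/8] \approx .1530$, in agreement with the untruncated column of Table II.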



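3. Power

In considering the problem of selecting an $\ell$ for a fixed $N$ when testing a given hypothesis based on the sample, the power of the test is an important consideration. Suppose the null hypothesis $H_0: \sigma_a^2 = 0$ is being tested against the alternative hypothesis $H_1: \sigma_a^2 \ne 0$ from a sample of $\ell$ classes, each class consisting of $r$ observations. From the analysis of variance table, the test statistic is found to be $MS_a/MS_e$, where

$$MS_a = \frac{SS_a}{\ell - 1} \quad \text{and} \quad MS_e = \frac{SS_e}{\ell(r-1)}.$$

It may be shown that $SS_a/(\sigma_e^2 + r\sigma_a^2)$ and $SS_e/\sigma_e^2$ are independent chi-square variables with $\ell - 1$ and $\ell(r-1)$ degrees of freedom respectively. The statistic $MS_a/MS_e$ may now be rewritten as

$$\frac{MS_a}{MS_e} = \frac{\sigma_e^2 + r\sigma_a^2}{\sigma_e^2} \cdot \frac{\left(\dfrac{SS_a}{\sigma_e^2 + r\sigma_a^2}\right) \Big/ (\ell - 1)}{\left(\dfrac{SS_e}{\sigma_e^2}\right) \Big/ \ell(r-1)} \tag{2.18}$$

and is, apart from the leading constant, a ratio of independent chi-square variables divided by their degrees of freedom. It is therefore distributed as $\frac{\sigma_e^2 + r\sigma_a^2}{\sigma_e^2} F_{(\ell-1),\,\ell(r-1)}$, where $F_{a,b}$ is a central F variable with $a$ and $b$ degrees of freedom.

If $H_0$ is true, i.e., $\sigma_a^2 = 0$, the test statistic

$$\frac{MS_a}{MS_e} = \frac{SS_a/(\ell-1)}{SS_e/\ell(r-1)}$$

is distributed as $F_{(\ell-1),\,\ell(r-1)}$. Thus, a test of the null hypothesis consists of rejecting $H_0$ at a level of significance $\alpha$ if

$$\frac{MS_a}{MS_e} > F_{\alpha;\,(\ell-1),\,\ell(r-1)}.$$

The power of this test, denoted $\beta(\theta)$, is given by

$$\beta(\theta) = P\!\left[\frac{MS_a}{MS_e} > F_{\alpha;\,(\ell-1),\,\ell(r-1)}\right], \quad \text{where } \theta = \frac{\sigma_a^2}{\sigma_e^2}.$$

But if $\sigma_a^2 \ne 0$, then

$$\beta(\theta) = P\!\left[\frac{\sigma_e^2 + r\sigma_a^2}{\sigma_e^2}\, F_{(\ell-1),\,\ell(r-1)} > F_{\alpha;\,(\ell-1),\,\ell(r-1)}\right] = P\!\left[F_{(\ell-1),\,\ell(r-1)} > \frac{\sigma_e^2}{\sigma_e^2 + r\sigma_a^2}\, F_{\alpha;\,(\ell-1),\,\ell(r-1)}\right].$$

This expression can be evaluated after the transformation

$$Y = \frac{1}{1 + \dfrac{\gamma_1}{\gamma_2} F_{\gamma_1,\gamma_2}}.$$

The variable $Y$ is distributed as $\beta(\gamma_1, \gamma_2)$. Thus

$$\beta(\theta) = P\!\left[\frac{1}{1 + \dfrac{\ell-1}{\ell(r-1)}\, F_{\ell-1,\,\ell(r-1)}} < \frac{1}{1 + \dfrac{(\ell-1)\,\sigma_e^2\, F_{\alpha;\,\ell-1,\,\ell(r-1)}}{\ell(r-1)(\sigma_e^2 + r\sigma_a^2)}}\right]$$

or

$$\beta(\theta) = I_x[\ell - 1,\ \ell(r-1)], \tag{2.19}$$

where

$$x = \frac{1}{1 + \dfrac{(\ell-1)\,\sigma_e^2\, F_{\alpha;\,(\ell-1),\,\ell(r-1)}}{\ell(r-1)(\sigma_e^2 + r\sigma_a^2)}},$$

yields the power of the test of hypothesis $H_0: \sigma_a^2 = 0$, for a specified $\alpha$ and $N$.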



B. DATA GENERATION

From Eqs. (2.9), (2.15), (2.16), and (2.19) it can be seen that each of the properties to be analyzed is dependent on four variables: $\sigma_e^2$, $\sigma_a^2$, $r$ and $\ell$. Recall that $N = \ell r$ is the total number of experiments to be conducted. If $N$ is fixed the choice of either $\ell$ or $r$ determines the other. Thus, for fixed $N$, we have only three variables, $\ell$, $\sigma_e^2$ and $\sigma_a^2$. We now wish to see what happens to the bias and variance of $\tilde\sigma_a^2$ and the power of the specified test of hypothesis as the three variables take on a range of values.

Calculation of these statistics was done on an IBM 360/67 computer using the basic program shown in Appendix A. In addition to this basic program, the IBM supplied subroutines for computation of the beta-distribution were also used.

The values chosen for $N$ were 12, 16, 20, 40, 50, 80 and 100. For each value of $N$, $\ell$ varied through all integral divisors of $N$ such that $4 \le \ell \le N/2$. For example, for $N = 80$, $\ell$ took on the values 4, 8, 16, 20, and 40.

Initially, for each combination of $N$ and $\ell$, both $\sigma_e^2$ and $\sigma_a^2$ were varied from .1 to 2.0 in steps of .1 and again from 1.0 to 20.0 in steps of 1.0. Values were computed for the variance and bias of $\tilde\sigma_a^2$, the variance of $\hat\sigma_a^2$ and the power of the specified test of hypothesis for all possible combinations of $N$, $\ell$, $\sigma_e^2$, and $\sigma_a^2$ in the ranges described.

The data generated in this manner supported the contention of Scheffé that $\sigma_e^2$ is simply a scaling factor for both the bias and variance of $\tilde\sigma_a^2$ and the variance of $\hat\sigma_a^2$. Figures 1 and 2 illustrate the scaling influence of $\sigma_e^2$ on the bias and variance of $\tilde\sigma_a^2$ when $N = 20$ and $\ell = 4$ and 5.



As for the power of the test of hypothesis, Scheffé has shown $\beta(\theta)$ to be dependent only upon the ratio $\sigma_a^2/\sigma_e^2$, and $\sigma_e^2$ again can be considered as a scaling factor.

Based on these considerations, $\sigma_e^2$ was set at one for all data generated for use in this thesis. This greatly reduced the amount of time and output required for computer runs and further reduced the number of input variables to two, $\ell$ and $\sigma_a^2$, for each fixed $N$. Further, if $\sigma_e^2 = 1$, the value of $\sigma_a^2$ is also the value of the ratio $K = \sigma_a^2/\sigma_e^2$.

Figure 1. Graph showing the effects of the scaling influence of $\sigma_e^2$ on the variance of $\tilde\sigma_a^2$ for $N = 20$. The three upper curves are for $\ell = 4$, and the three lower for $\ell = 5$.

Figure 2. [...] The top curve in each pair is for $\ell = 4$, and the bottom curve for $\ell = 5$.

Since $\sigma_e^2$ is a scaling factor, all conclusions drawn using $\sigma_e^2 = 1$ are equally valid for any other $\sigma_e^2$. The direction of change for fixed $\sigma_a^2$ is as follows: as $\sigma_e^2$ increases (decreases) the variance and bias increase (decrease) and the power of the test of hypothesis decreases (increases). The magnitude of the change depends on the magnitudes of $\sigma_e^2$ and $\sigma_a^2$, $N$ and $\ell$.

In order to evaluate the power of the test of hypothesis, it was necessary to choose a level of significance, alpha. An alpha of .05 was used throughout this paper.

Wang [Ref. 8] has conducted a similar study of the bias and variance of several estimators, including $\tilde\sigma_a^2$. Her study was restricted to the special case where both mean squares took on only even degrees of freedom and $N$ took on values of 9, 27, 81 and 225. There are no direct points of comparison between the data she generated and the data in this thesis. However, a very favorable comparison of computed variance and bias exists for values of $N$ and $\ell$ as nearly matching as is possible. Wang's variance and bias expressions for $N = 81$, $\ell = 9$, and $\sigma_a^2 = 1.0$ yield $\operatorname{var}(\tilde\sigma_a^2) = .309$ and $\operatorname{bias}(\tilde\sigma_a^2) = 0$ while the data from this thesis for $N = 80$, $\ell = 10$, and $\sigma_a^2 = 1.0$ yield $\operatorname{var}(\tilde\sigma_a^2) = .282$ and $\operatorname{bias}(\tilde\sigma_a^2) = 0$.

III. PRESENTATION OF FINDINGS



A. GENERAL OBSERVATIONS

Before attempting to address the specific question of the selection of $\ell$, several general conclusions can be drawn from the data regarding the bias of $\tilde\sigma_a^2$, the variance of $\tilde\sigma_a^2$ and $\hat\sigma_a^2$, and the power of the specified test of hypothesis.

Generally speaking, the bias of $\tilde\sigma_a^2$ is very small. As shown in Table I, the bias decreases as $K$, the ratio $\sigma_a^2/\sigma_e^2$, increases. For small $N$ and small $K$, the bias is significant. However, if $K$ is greater than 1.0 and $N$ greater than or equal to 20, the magnitude of the bias is so small that it can be neglected. For this range of $N$ and $K$, the maximum value the bias assumed is less than one percent of the true value of $\sigma_a^2$. Thus, $\tilde\sigma_a^2$ is virtually unbiased in this range. The bias of $\tilde\sigma_a^2$ was found to be negative for many combinations of $N$, $\ell$, and $K$. However, for the entire range of input variables for which data was generated, the negative bias was always insignificant to the fourth decimal place.

As shown in Table II, the variances of $\tilde\sigma_a^2$ and $\hat\sigma_a^2$ are very nearly the same except when $K$ is small. For small values of $K$ there is a significant difference between the two. However, this difference decreases sharply as $K$ increases and is negligible for $K \ge 1.0$. The difference between the variances is further decreased as $N$ increases. Thus the variances of $\tilde\sigma_a^2$ and $\hat\sigma_a^2$ appear to approach each other asymptotically as $N$ and/or $K$ increase.

TABLE I

EFFECT OF N AND K ON THE BIAS OF $\tilde\sigma_a^2$
($\ell = 4$, $\sigma_e^2 = 1.0$)

    N      K      BIAS          N      K      BIAS
   12     .1     .0951         40     .1     .0153
          .5     .0478                .5     .0037
         1.0     .0270               1.0     .0016
         2.0     .0129               2.0     .0006
   16     .1     .0627         80     .1     .0045
          .5     .0265                .5     .0010
         1.0     .0138               1.0     .0005
         2.0     .0061               2.0     .0001
   20     .1     .0452        100     .1     .0029
          .5     .0167                .5     .0004
         1.0     .0081               1.0     .0002
         2.0     .0035               2.0     .0001



Table II also indicates that the variance of $\tilde\sigma_a^2$ is always less than or equal to the variance of $\hat\sigma_a^2$. The introduction of a small amount of bias by truncation of the estimator tends to reduce the variance.

As is to be expected, the power of the test of hypothesis increases as $N$ increases. In the model proposed here, power is also a function of $\ell$ when $N$ is fixed. For all values of $K$ tested, it was found that if $N \ge 16$ and $\ell < N/2$, $\beta(\theta) \ge .9996$. This implies that for values of $N \ge 16$ and $\ell < N/2$ the power criterion can be ignored in the selection of $\ell$. Attention can then be directed to minimizing variance and/or bias in selecting $\ell$ with the assurance of a very strong test of the hypothesis.

TABLE II

DIFFERENCES IN TRUNCATED ($\tilde\sigma_a^2$) AND UNTRUNCATED ($\hat\sigma_a^2$)
VARIANCE ESTIMATORS FOR VARIOUS VALUES OF N, K, AND $\ell$

    N        K      $\ell$   VAR($\tilde\sigma_a^2$)   VAR($\hat\sigma_a^2$)
   12     0.1000     4          0.0648                  0.1530
   12     0.5000     4          0.3461                  0.4907
   12     1.0000     4          1.0158                  1.2130
   12     2.0000     4          3.3828                  3.6574
   12     5.0000     4         18.5601                 18.9907
   20     0.1000     5          0.0336                  0.0696
   20     0.5000     5          0.2376                  0.2896
   20     1.0000     5          0.7305                  0.7896
   20     2.0000     5          2.4754                  2.5396
   20     5.0000     5         13.7216                 13.7896
   40     0.1000     8          0.0170                  0.0282
   40     0.5000     8          0.1347                  0.1425
   40     1.0000     8          0.4093                  0.4139
   40     2.0000     8          1.3831                  1.3854
   40     5.0000     8          7.7275                  7.7282
  100     0.1000    20          0.0081                  0.0105
  100     0.5000    20          0.0525                  0.0526
  100     1.0000    20          0.1526                  0.1526
  100     2.0000    20          0.5105                  0.5105
  100     5.0000    20          2.8473                  2.8473

B. THE SELECTION OF $\ell$ FOR FIXED N

In the model being studied, it has been assumed that the number of observations of the random variable being observed is fixed at $N$. Further, it is assumed that $r$ observations will be made on each of $\ell$ classes of the observable phenomenon so that $\ell r = N$. The problem now arises of how to choose $\ell$ (or $r$) so as to obtain the best statistical results. The problem is complicated by the fact that the "best" solution is dependent on the desired result of the analysis. For example, the $\ell$ that provides the most powerful test of hypothesis for a given $N$ may very well produce maximum bias in our estimate of $\sigma_a^2$. In the same manner, the $\ell$ that provides minimum bias or minimum variance in the estimator may produce a very weak test of the hypothesis that $\sigma_a^2 = 0$.

The selection of $\ell$ is further complicated by the fact that all of the parameters of interest are dependent on $K$.

Hammersley [Ref. 1] developed an expression for $r$ which produces minimum variance in $\hat\sigma_a^2$, the unbiased estimator of $\sigma_a^2$. Equating the first derivative of the expression for the variance of $\hat\sigma_a^2$ to zero, Hammersley showed that the integral divisor of $N$ that most nearly satisfies

$$r_h = \frac{(K+1)N + 1}{KN + 2}$$

produces minimum variance in $\hat\sigma_a^2$. For the range of $N$ and $\ell$ used in this study, the value of $r_h$ also produces minimum variance for $\tilde\sigma_a^2$.

The $r_h$ proposed by Hammersley has two unpleasant features. First, for some combinations of $N$ and $K$, the power of the specified test of hypothesis is very low. For example, if $K = .5$ and $N = 12$, the power of a test of $\sigma_a^2 = 0$ is only .2561.

The second and perhaps more serious feature is that Hammersley's solution for $r_h$ requires a knowledge of the ratio $\sigma_a^2/\sigma_e^2$ prior to conducting the intended analysis. In an environment such as the flexural rigidity experiment, where the general magnitudes of $\sigma_a^2$ and $\sigma_e^2$ would be known from previous experiments on similar products, this requirement may not be serious. However, for a one-time-only experiment, or an evaluation of a new process, this requirement may be completely unreasonable.

The results of the present study indicate that power is maximized for small $\ell$ while variance is minimized when $\ell$ assumes its maximum value of $N/2$. But it has already been shown that power is not a major consideration for $N \ge 16$ if $\ell < N/2$. It would appear then that $\ell$ should be selected very near to but not equal to $N/2$.

Based on these considerations it appears that

$$\ell_g = \left[\frac{N}{2} - 1\right]$$

such that $N/\ell_g$ is an integer is the "best" choice for $\ell$; that is, $\ell_g$ is the next smaller integral divisor of $N$. As an example, for $N = 20$,

$$\ell_g = [9] = 5$$

is the best choice for $\ell$. (See Table III.)

Table III shows a comparison of $\ell_h$ ($\ell_h = N/r_h$) and $\ell_g$ for various values of $N$. It also shows the power, variance, and bias generated for each choice of $\ell_h$ and $\ell_g$ in a range of $K$ values. It can be seen that $\ell_h$ increases with $K$ to its maximum value of $N/2$ while $\ell_g$ is fixed for a given $N$. Also, when $\ell_h = N/2$, the power of the test is small for small $N$. In fact, for $N$ as large as 100, the power using $\ell_h$ may be less than .9 while power for $\ell_g$ never falls below .9 for any $N$. As was expected, the variance of $\tilde\sigma_a^2$ using $\ell_h$ is considerably less than the variance acquired using $\ell_g$, since $\ell_h$ was derived as the minimum variance choice of $\ell$.

Generally speaking, the bias of the estimator when $\ell = \ell_g$ is less than or equal to the bias when $\ell = \ell_h$. The only exceptions occur when $K = .1$ and $N = 80$ and 100. The bias for both $\ell = \ell_g$ and $\ell = \ell_h$ is generally less than three percent of the true value of $\sigma_a^2$ if $K \ge 1.0$ and less than one percent for $K \ge 1.0$ and $N \ge 20$.

Again it seems that the method of selecting $\ell$ depends on the desired results of the original analysis. $\ell_h$ will always produce minimum variance in the estimate of $\sigma_a^2$ but requires a knowledge of the ratio $K = \sigma_a^2/\sigma_e^2$. If $K$ is known and a minimum variance estimator is desired, this is certainly the best method of choosing $\ell$.

If a powerful test of hypothesis is desired, $\ell_g$ gives a much more powerful test for most combinations of $N$ and $K$ than will $\ell_h$. If nothing is known of $K$, $\ell_g$ gives a powerful test and a relatively small variance.
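The two allocation rules compared in this section can be written down in a few lines. The following is a minimal sketch with hypothetical function names. It assumes that "most nearly satisfies" means the divisor of $N$ closest to $r_h$ in absolute value, and that candidate values of $r$ run from 2 to $N/2$; both are assumptions of this sketch rather than statements from the text.

```python
def divisors(n_total):
    """All positive integral divisors of N, in increasing order."""
    return [d for d in range(1, n_total + 1) if n_total % d == 0]


def ell_g(n_total):
    """Largest integral divisor of N that does not exceed N/2 - 1."""
    bound = n_total / 2 - 1
    return max(d for d in divisors(n_total) if d <= bound)


def hammersley_r(n_total, k_ratio):
    """Divisor of N nearest to r_h = ((K+1)N + 1) / (KN + 2).

    Candidates are restricted to 2 <= r <= N/2 (an assumption); ties go
    to the smaller divisor because min() keeps the first minimum found.
    """
    r_target = ((k_ratio + 1) * n_total + 1) / (k_ratio * n_total + 2)
    candidates = [d for d in divisors(n_total) if 2 <= d <= n_total // 2]
    return min(candidates, key=lambda d: abs(d - r_target))


def ell_h(n_total, k_ratio):
    """Number of classes implied by Hammersley's r."""
    return n_total // hammersley_r(n_total, k_ratio)


if __name__ == "__main__":
    for n_total in (12, 20, 40, 100):
        print(n_total, ell_g(n_total), ell_h(n_total, 0.5))
```

For $N = 20$ the bound is 9 and the largest divisor not exceeding it is 5, matching the worked example above. For $N = 12$ and $K = .5$, $r_h = 19/8$, so the nearest divisor is $r = 2$ and $\ell_h = 6 = N/2$.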
IV. CONCLUSIONS AND RECOMMENDATIONS



A. CONCLUSIONS 

It may be concluded that, with one exception, $\ell_g$ is the "best" method of choosing the number of classes for Model II analysis of variance when $N$ is fixed. The exception occurs in the case where $K$ is known and a minimum variance estimator of $\sigma_a^2$ is desired without regard to the power of the test of hypothesis that $\sigma_a^2 = 0$. In this case $\ell_h$ appears best. The use of $\ell_g$ assures a very powerful test of hypothesis and will yield a small, but not minimum, variance in the estimator. For most combinations of $N$ and $K$, $\ell_g$ also produces minimum bias in $\tilde\sigma_a^2$.

If $N \ge 20$ and $K > 1.0$, the bias of $\tilde\sigma_a^2$ is so small as to be negligible. In such cases, the use of the truncated estimator of $\sigma_a^2$ has no significant influence on the results of the analysis except to cause a small decrease in variance. As $N$ and/or $K$ increase, this decrease in variance appears to tend toward zero.

B. RECOMMENDATIONS FOR FURTHER STUDY

It is suggested that a similar study of variance estimators be conducted for values of $N$ greater than 100 for the full range of $K$ values studied here. Such a study might also investigate values of $K$ less than the minimum value of .1 used in this study.

A much more difficult task that could follow the same general approach would be an investigation of two-way and multi-way analysis in an effort to determine the best number of experiments for each class to provide minimum variance in the variance estimators and maximum power for a specified test of hypothesis.

APPENDIX A



THIS PROGRAM IS DESIGNED TO COMPUTE THE VARIANCE, POWER 
AND BIAS FOR THE Y+ ESTIMATOR OF THE BETWEEN CLASS VARIANCE 
FOR THE BALANCED, ONE-WAY ANALYSIS OF VARIANCE, MODEL II. 



EXPLANATION OF SYMBOLS:

VAR IS THE TRUE WITHIN CLASS VARIANCE.

VARA IS THE TRUE BETWEEN CLASS VARIANCE.

L IS THE NUMBER OF CLASSES.

IR IS THE NUMBER OF EXPERIMENTS IN EACH CLASS.

XK IS THE RATIO OF THE BETWEEN CLASS AND WITHIN CLASS VARIANCES.

N IS THE TOTAL NUMBER OF EXPERIMENTS. N=L*IR.

VART IS THE VARIANCE OF THE Y+, OR POSITIVE TRUNCATION, ESTIMATE OF THE BETWEEN CLASS VARIANCE.

VARR IS THE VARIANCE OF THE MINIMUM VARIANCE UNBIASED ESTIMATOR OF THE BETWEEN CLASS VARIANCE.

XMEAN IS THE EXPECTED VALUE OF Y+.

POW IS THE POWER OF THE TEST OF HYPOTHESIS THAT THE TRUE BETWEEN CLASS VARIANCE IS ZERO.

XC IS THE F-STATISTIC FOR ALPHA=.05 AND L-1 AND L*(IR-1) DEGREES OF FREEDOM, USED IN COMPUTING THE POWER.

THE SUBROUTINE BDTR COMPUTES THE PROBABILITY THAT THE RANDOM VARIABLE U, DISTRIBUTED ACCORDING TO THE BETA-DISTRIBUTION WITH PARAMETERS A AND B, IS LESS THAN OR EQUAL TO X; BDTR(X,A,B)

THE FUNCTION EYPLUS COMPUTES THE EXPECTED VALUE OF Y+.

THE FUNCTION VYPLUS COMPUTES THE VARIANCE OF Y+.

C     READ ONE CASE (L, IR, XC); A NONPOSITIVE L ENDS THE RUN
 1002 READ(5,105) L,IR,XC
      VAR=1.
      VARA=0.
      IF(L) 1000,1000,1001
 1001 WRITE(6,100)
      XN=L-1
      XM=L*(IR-1)
C     STEP THE BETWEEN CLASS VARIANCE FROM 1.0 TO 20.0
      DO 3000 I=1,20
      VARA=FLOAT(I)
C     A AND B ARE THE CONSTANTS OF EQ. (2.11)
      A=(VAR+FLOAT(IR)*VARA)/FLOAT(IR*(L-1))
      B=VAR/FLOAT(L*IR*(IR-1))
      XMEAN=EYPLUS(XM,XN,A,B)
      BIAS=XMEAN-VARA
      VART=VYPLUS(XM,XN,A,B,XMEAN)
C     VARIANCE OF THE UNTRUNCATED ESTIMATOR, EQ. (2.17)
      VARR=(2./FLOAT(IR**2))*((VAR+FLOAT(IR)*VARA)**2/
     1FLOAT(L-1)+VAR**2/FLOAT(L*(IR-1)))
C     POWER OF THE TEST, EQ. (2.19)
      X=1./(1+XN*XC*VAR/(XM*(VAR+FLOAT(IR)*VARA)))
      CALL BDTR(X,XN,XM,P,D,IER)
      POW=P
      N=L*IR
      XK=VARA/VAR
 3000 WRITE(6,101) N,L,XK,VARA,POW,VARR,VART,BIAS
      GO TO 1002
 1000 CONTINUE
  100 FORMAT(' N L XK VARA POWER VAR(UNTRUN) VAR(TRUN)',
     1' BIAS(TRUN)'//' ')
  101 FORMAT(' ',I3,2X,I2,3F8.4,2X,F8.4,2X,F8.4,2X,F8.4)
  105 FORMAT(I2,I4,F6.2)
      END



      FUNCTION EYPLUS(XM,XN,A,B)
C     EXPECTED VALUE OF Y+, EQ. (2.9)
      C=A/(A+B)
      H=XM/2.
      E=XN/2.
      CALL BDTR(C,H,E,P,D,IER)
      BET=P
      F=H+1.
      G=E-1.
C     THE SECOND BETA PARAMETER MUST BE POSITIVE
      IF(G) 5,5,10
    5 EYPLUS=0.
      RETURN
   10 CALL BDTR(C,F,G,P,D,IER)
      BET1=P
      EYPLUS=XN*A*BET-XM*B*BET1
      RETURN
      END

      FUNCTION VYPLUS(XM,XN,A,B,XMEAN)
C     VARIANCE OF Y+, EQ. (2.15)
      IF(XMEAN) 10,5,10
    5 VYPLUS=0.
      RETURN
   10 C=XM/2.
      H=XN/2.
      E=A/(A+B)
      F=XN*A**2+XM*B**2
      CALL BDTR(E,C,H,P,D,IER)
      BET=P
      G=XN*A-XM*B
      YM=XM/2.
      YN=XN/2.
      GANM=YM+YN
      CALL GMMMA(YM,GX,IER)
      GAM=GX
      CALL GMMMA(YN,GX,IER)
      GAN=GX
      CALL GMMMA(GANM,GX,IER)
C     COG IS THE BETA FUNCTION WITH PARAMETERS M/2 AND N/2
      COG=GAM*GAN/GX
      P=E**C
      R=(1.-E)**(H-1.)
C     THE BETA FUNCTION DIVIDES THIS TERM, AS IN EQ. (2.15)
      XK=P*R*2.*B/COG
C     G*XMEAN IS THE (NA-MB)*E(Y+) TERM OF EQ. (2.15)
      VYPLUS=2.*F*BET+G*XMEAN+2.*XK*(A-B)-XMEAN**2
      RETURN
      END